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Abstract. We provide a solution to the problem of receding horizon control for sto- 
chastic discrete-time systems with bounded control inputs and imperfect state mea- 
surements. For a suitable choice of control policies, we show that the finite-horizon 
optimization problem to be solved on-line is convex and successively feasible. Due to 
the inherent nonlinearity of the feedback loop, a slight extension of the Kalman filter 
is exploited to estimate the state optimally in mean-square sense. We show that the 
receding horizon implementation of the resulting control policies renders the state of 
the overall system mean-square bounded under mild assumptions. Finally, we discuss 
how some of the quantities required by the finite-horizon optimization problem can 
be computed off-line, reducing the on-line computation, and present some numerical 
examples. 



1. Introduction 

A considerable amount of research effort has been devoted to deterministic receding 
horizon control, see, for example, [Bemporad and Morari, 1999; Lazar et al., 2007; Ma- 
ciejowski, 2001; Mayne et al., 2000; Qin and Badgwcll, 2003] and references therein. Cur- 
rently we have readily available conclusive proofs of successive feasibility and stability 
of receding horizon control laws in the noise-free deterministic setting. These techniques 
can be readily extended to the robust case, i.e., whenever there is an additive noise of 
bounded nature entering the system. The counterpart for stochastic systems subject to 
process noise and imperfect state measurements and bounded control inputs, however, is 
still missing. The principal obstacle is posed by the fact that it may not be possible to de- 
termine an a priori bound on the support of the noise, e.g., whenever the noise is additive 
and Gaussian. This extra ingredient complicates both the stability and feasibility proofs 
manifold: the noise eventually drives the state outside any bounded set no matter how 
large the latter is taken to be, and employing any standard linear state feedback means 
that any hard bounds on the control inputs will eventually be violated. 

In this article we propose a solution to the general receding horizon control problem 
for linear systems with noisy process dynamics, imperfect state information, and bounded 
control inputs. Both the process and measurement noise sequences are assumed to enter 
the system in an additive fashion, and we require that the designed control policies satisfy 
hard bounds. Periodically at times t = 0, N c , 2N C , • • • , where N c is the control horizon, a 
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certain finite-horizon optimal control problem is solved for a prediction horizon N. The 
underlying cost is the standard expectation of the sum of cost-per-state functions that 
are quadratic in the state and control inputs [Bertsekas, 2005]. We also impose some 
variance-like bounds on the predicted future states and inputs — this is one possible way 
to impose soft state-constraints that are in spirit similar to integrated chance-constraints, 
e.g., in [Klein Hanevcld, 1983; Klein Haneveld and van der Vlerk, 2006]. 

There are several key challenges that are inherent to our setup. First, since state- 
information is incomplete and corrupt, the need for a filter to estimate the state from 
the corrupt state measurements naturally presents itself. Second, it is clear that in the 
presence of unbounded (e.g., Gaussian) noise, it is not in general possible to ensure any 
bound on the control values with linear state feedback alone (assuming complete state 
information is available) — the additive nature of the noise ensures that the states exit 
from any fixed bounded set at some time almost surely. We see at once that nonlinear 
feedback is essential, and this issue is further complicated by the fact that only incomplete 
and imperfect state-information is available. Third, it is unclear whether applicatino of 
the control policies stabilizes the system in any reasonable sense. 

In the backdrop of these challenges, let us mention the main contributions of this article. 
We 

(i) provide a tractable method for designing, in a receding horizon fashion, bounded 
causal nonlinear feedback policies, and 

(ii) guarantee that applying the designed control policies in a receding horizon fashion 
renders the state of the closed-loop system mean-square bounded for any initial state 
statistics. 

We elaborate on the two points above: 

(i) Tractability: Given a subclass of causal policies, we demonstrate that the underlying 
optimal control problem translates to a convex optimization problem to be solved 
every N c time steps, that this optimization problem is feasible for any statistics of the 
initial state, and equivalent to a quadratic program. The subclass of polices we employ 
is comprised of open-loop terms and nonlinear feedback from the innovations. Under 
the assumption that the process and measurement noise is Gaussian, (even though 
the bounded inputs requirement makes the problem inherently nonlinear and it may 
appear that standard Kalman filtering may not apply,) it turns out that Kalman 
filtering techniques can indeed be utilized. We provide a low-complexity algorithm 
(essentially similar to standard Kalman filtering) for updating this conditional density, 
and report tractable solutions for the off-line computation of the time-dependent 
optimization parameters. 

(ii) Stability: It is well known that for noise-free discrete-time linear systems it is not 
possible to globally asymptotically stabilize the system, if the usual matrix A has 
unstable eigenvalues, see, for example, [Yang et al., 1997] and references therein. 
Moreover, in the presence of stochastic process noise the hope for achieving asymptotic 
stability is obviously not realistic. Therefore, we naturally relax the notion of stability 
into mean-square boundedness of the state and impose the extra conditions that 
the system matrix A is Lyapunov (or neutrally) stable and that the pair (A, B) is 
stabilizable. Under such standard assumptions, we show that it is possible to augment 
the optimization problem with an extra stability constraint, and, consequently, that 
the successive application of our resulting policies renders the state of the overall 
system (plant and estimator) mean-square bounded. 

Related Works. The research on stochastic receding horizon control is broadly subdi- 
vided into two parallel lines: the first treats multiplicative noise that enters the state 
equations, and the second caters to additive noise. The case of multiplicative noise has 
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been treated in some detail in [Cannon et al., 2009; Primbs, 2007; Primbs and Sung, 2009], 
where the authors consider also soft constraints on the state and control input. However, 
in this article we exclusively focus on the case of additive noise. 

The approach proposed in this article stems from and generalizes the idea of affinc 
parametrization of control policies for finite-horizon linear quadratic problems proposed 
in [Ben-Tal et al., 2006, 2004], utilized within the robust MPC framework in [Ben-Tal et al., 
2006; Goulart et al., 2006; Lofbcrg, 2003] for full state feedback, and in [van Hessem and 
Bosgra, 2003] for output feedback with Gaussian state and measurement noise inputs. 
More recently, this affine approximation was utilized in [Skaf and Boyd, 2009] for both 
the robust deterministic and the stochastic setups in the absence of control bounds, and 
optimality of the affine policies in the scalar deterministic case was reported in [Bcrtsimas 
et al., 2009]. In [Bcrtsimas and Brown, 2007] the authors reformulate the stochastic 
programming problem as a deterministic one with bounded noise and solve a robust 
optimization problem over a finite horizon, followed by estimating the performance when 
the noise can take unbounded values, i.e., when the noise is unbounded, but takes high 
values with low probability. Similar approach was utilized in [Oldcwurtcl et al., 2008] 
as well. There are also other approaches, e.g., those employing randomized algorithms 
as in [Batina, 2004; Blackmorc and Williams, 2007; Maciejowski et al., 2005]. Results 
on obtaining lower bounds on the value functions of the stochastic optimization problem 
have been recently reported in [Wang and Boyd, 2009]. 

Organization. The remainder of this article is organized as follows. In Section 2 we 
formulate the receding horizon control problem with all the underlying assumptions, the 
construction of the estimator, and the main optimization problems to be solved. In Section 
3 we provide the main results pertaining to tractability of the optimization problems and 
mean-square boundedness of the closed-loop system. We comment on the obtained results 
and provide some extensions in Section 4. We provide all the needed preliminary results, 
derivations, and proofs in Section 5, and some numerical examples in Section 6. Finally, 
we conclude in Section 7. 

Notation. Let (f2, J, P) be a general probability space, we denote the conditional ex- 
pectation given the sub-cr algebra of # as Ey[.]. For any random vector s we let 
S s := E[ss T }. Hereafter we let N+ := {1, 2, . . .} and N := N + U {0} be the sets of positive 
and nonnegative integers, respectively. We let tr(-) denote the trace of a square matrix, 
|| ■ || p denote the standard i v norm, and simply ||.|| denote the £ 2 -norm. In a Euclidean 
space we denote by B r the closed Euclidean ball or radius r centered at the origin. For 
any two matrices A and B of compatible dimensions, we denote by 9\k(A,B) the fc-th 
step reachability matrix Ui^A, B) := \A k ~ 1 B ■ ■ ■ AB B] . For any matrix M, we let 
cr m i n and <r max be its minimal and maximal singular values, respectively. We let (M) il . i2 
denote the sub-matrix obtained by selecting the rows i\ through 12 of M, and let (M), 
denote its i-th row. 

2. Problem Setup 

We are given a certain plant with discrete-time dynamics, process noise wt, and im- 
perfect state measurements y t tainted through noise v t (Figure 1). The objective is to 
design a receding horizon controller which renders the overall closed-loop system mean- 
square bounded. We discuss the structure and the main assumptions on the dynamics of 
the system, the performance index to be minimized, and the construction of the optimal 
mean-square estimator x t in Subsection 2.1. Then, we formalize the optimization problem 
to be solved in a receding horizon fashion (to generate the input policies u t ) in Subsection 
2.2. 
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Figure 1 . Main setup 



2.1. Dynamics, Performance Index and Estimator. Consider the following affine 
discrete-time stochastic dynamical model: 

xt+i = Ax t + Bu t + w t , (2.1a) 

y t = Cx t + v t (2.1b) 

where t € N, x t £ E™ is the state, ut G M m is the control input, y t G K p is the output, 
uit G K™ is a random process noise, vt 6 K ? is a random measurement noise, and A, B, 
and C are known matrices. We posit the following main assumption: 

Assumption 1. 

(i) The system matrices in (2.1a) satisfy the following: 

(a) The pair (A, B) is stabilizable. 

(b) A is discrete-time Lyapunov stable [Bernstein, 2009, Chapter 12], i.e., the eigen- 
values {Xi(A) | i = 1, . . . , d} lie in the closed unit disc, and those eigenvalues 
Xj(A) with |Aj(A)| = 1 have equal algebraic and geometric multiplicities. 

(ii) The initial condition and the process and measurement noise vectors are normally 
distributed, i.e., Xq ~ 7V(0, £ Xo ), w t ~ Af(0,Yl w ), and ~ ^(0,2^). Moreover, Xq, 
(w t )t£N an( i (^t)teN are mutually independent. 

(iii) The control inputs take values in the control constraint set: 

U:={eeM m | HeiL ^C/ max }; (CI) 
i.e., u t G U for each t G N. <} 

Without any loss of generality, we also assume that A is in real Jordan canonical 
form. Indeed, given a linear system described by system matrices (A, B) , there exists 
a coordinate transformation in the state-space that brings the pair (A, Bj to the pair 
(A, B), where A is in real Jordan form [Horn and Johnson, 1990, p. 150]. In particular, 
choosing a suitable ordering of the Jordan blocks, we can ensure that the pair (A, B) has 



the form 



Ax 






A 2 



Bi 
B 2 



, where A\ € 



is Schur stable, and A 2 € 



has 



its eigenvalues on the unit circle. By hypothesis (i)-(b) of Assumption 1, A 2 is therefore 
block-diagonal with elements on the diagonal being either ±1 or 2 x 2 rotation matrices. As 
a consequence, A 2 is orthogonal. Moreover, since (A,B) is stabilizable, the pair (A 2 ,B 2 ) 
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must be reachable in a number of steps k ^ n 2 that depends on the dimension of A 2 
and the structure of (A 2 ,B 2 ), i.e., rank(9i K (A 2 , B 2 )) = n 2 . Summing up, we can start by 
considering that the state equation (2.1a) has the form 



(2.2) 



where A-± is Schur stable, A 2 is orthogonal, and there exists a nonnegative integer k ^ n 2 
such that the subsystem (A 2 , B 2 ) is reachable in k steps. This integer n is fixed throughout 
the rest of the article. 

Fix a prediction horizon N G N + , and let us consider, at any time t, the cost V t ■= 
V(t,y t ,u) defined by: 

'JV-l 



" (!) " 

x t+l 




A lX [ iy 










T (2) 




A 2 xf\ 
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. B 2_ 


u t + 


3« 



V t =E 



^2{ x J+kQkX t +k + uJ +k R k u t+k ) + xJ +N Q N x t+N 



k=0 



(2.3) 



where = {yo, Ui, ■ ■ ■ ,Ut} is the set of outputs up to time t, (or more precisely the a- 
algebra generated by the set of outputs,) and Q k = Q~[ > 0, Qn = On > 0, and Rk = 
RJ, > are some given symmetric matrices of appropriate dimension, k = 0, . . . , N— 1. At 
each time instant t, we are interested in minimizing (2.3) over the class of causal output 
feedback strategies Q defined as: 

9t+\{yt+i) 



hi 

Ut+l 



(2.4) 



{gt 



gt+N-i{yt+N-i) 

■ ■ ■ ,gt+N-i} which satisfy the hard constraint 



Ut+N- 

for some measurable functions g f 
(CI) on the inputs. 

The cost Vt in (2.3) is a conditional expectation given the observations up through time 
t, and as such requires the conditional density f(xt\yt) of the state given the previous 
and current measurements. Our choice of causal control policies in (2.4) motivates us to 
rewrite the cost V t in (2.3) as: 



V t =E 



y t 



IE 



{y t ,x t } 



{ x J+kQkX t +k + uJ +k R k U t+ k) + xJ +N Q N X t+ N 



L k=0 



(2.5) 



The propagation of f{x t \yt) in time is not a trivial task in general. In what follows we 
shall report an iterative method for the computation of f(x t \yt), whenever the control 
input is a measurable deterministic feedback from the past and present outputs. For 
t,s e N , define x t \ s = Ey s [x t ] and P t \ s = Ey s [(x t - x t \ s )(x t - x t | s ) T ]. 

Assumption 2. We require that T, w > and > 0. <^> 

The following result is a slight extension of the usual Kalman Filter for which a proof 
can be found in Appendix A. An alternative proof can also be found in [Kumar and 
Varaiya, 1986, p. 102]. 

Proposition 3. Under Assumption l-(ii) and Assumption 2, f(x t \yt) and f{x t +\ |3^t) 
the probability densities of Gaussian distributions Af(x t \ t , P t \ t ) and Af(x t+1 \ t , P t+ i\ t ), 



respectively, with P t \ t > and Pt+i\t > 0- For t = 0, 1, 2, 
covariances can be computed iteratively as follows: 



H\u^t\t) 
. . , their conditional means and 



x t+i\t+i — x t+i\t 

+ P t+llt C T (CP t+1]t C J + S v )-\y t+1 



Cx t 



(2.6) 



(i 
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Pt+i\t+i — Pt+x\t — Pt+i\tC J (C P t+ i\ t C T + E„) 1 CP t+ i\ t , (2.7) 

where 

x t+llt =Ax tlt + Bg t (y t ), (2.8) 

Pt+x\t = AP t \ t A T + E w . (2.9) 

In particular, the matrix P t \ t together with i t | t characterize the conditional density 
f(x t \y t ), which is needed in the computation of the cost (2.5). Proposition 3 states that 
the conditional mean and covariances of Xt can be propagated by an iterative algorithm 
which resembles the Kalman filter. Since the system is generally nonlinear due to the 
function g being nonlinear, we cannot assume that the probability distributions in the 
problem are Gaussian (in fact, the a priori distributions of Xt and of y t are not) and the 
proof cannot be developed in the standard linear framework of the Kalman filter. 

Hereafter, we shall denote for notational convenience by Xt the estimate x t | t , and let 



x t 



which corresponds to the Jordan decomposition in (2.2). 



(2.10) 



2.2. Optimization Problem and Control Policies. Having discussed the iterative 
update of the laws of Xt given the measurements J^t in Section 2.1, we return to our 
optimization problem in (2.5). We can rewrite our optimization problem as: 



mm-; 

St 



injVf dynamics (2. la)-(2. lb) and constraint (CI) | , (OPl) 



where the collection of functions g< were defined following (2.4). 

The explicit solution via dynamic programming to Problem (OPl) over the class of 
causal feedback policies G, is difficult if not impossible to obtain in general [Bcrtsekas, 
2000, 2007]. One way to circumvent this difficulty is to restrict our search for feedback 
policies to a subclass of Q. This will result in a suboptimal solution to our problem, but 
may yield a tractable optimization problem. This is the track we pursue next. 

Given a control horizon N c and a prediction horizon N (^ N c ), we would like to 
periodically solve Problem (OPl) at times t — 0, N c , 2N C , ■ ■ ■ over the following class of 
affinc-like control policies 



ui = m + ^iViiVi - Vi) for aU £ = t, • • • , t + iV - l, (POL) 



where yi = C£i, and for any vector z G M p , <Pi{z) [<pi.i(zi), . . . , (pi yP (z p )~\ T , where zj 

is the j-th element of the vector z and tpi j : K — > K is any function with sup (^^(s)! ^ 

sew 



fmax < oo. The feedback gains 6*^ g ]g mx P and the affine terms r/e & R m are the decision 
variables. The value of ui in (POL) depends on the values of the measured outputs from 
the beginning of the prediction horizon at time t up to time i only, which requires a finite 
amount of memory. Note that we have chosen to saturate the measurements we obtain 
from the vectors (yi — iji) before inserting them into the control policy. This way we 
do not assume that either the process noise or the measurement noise distributions are 
defined over a compact domain, which is a crucial assumption in robust MPC approaches 
[Maync ct al., 2000]. Moreover, the choice of element-wise saturation functions (fi(-) is left 
open. As such, we can accommodate standard saturation, piecewise linear, and sigmoidal 
functions, to name a few. 
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Finally, we collect all the ingredients of the stochastic receding horizon control problem 
in Algorithm 1 below. 



Algorithm 1 Basic Stochastic Receding Horizon Algorithm 

Require: density f(x \y-i) =jV(0,£ a . o ) 
1: set t = 0, £o|-i = 0, and Po\-i — ^x 

2: loop 

3: for i = to N c - 1 do 
4: measure yt+i 

5: update x t +i(= x t +i\t+i) and P t +i\t+i using (2.6)-(2.7) 
6: if i = then 

7: solve the optimization problem (OP1) with the control policies in (POL) to 

obtain {u* t , ■■■ , u* t+N _ 1 } 
8: end if 

9: apply u* t+l 

10: calculate x t+i+ i\t+i and P t+i+1 \ t+i using (2.8)-(2.9) 

11: end for 

12: set t = t + N c 

13: end loop 



Compact Notation. In order to state the main results in Section 3, it is convenient to 
utilize a more compact notation that describes the evolution of all signals over the entire 
prediction horizon N. 

The evolution of the system dynamics (2.1a)-(2.1b) over a single prediction horizon N, 
starting at t, can be described in a lifted form as follows: 



X t = Ax t + BU t 
Y t = CX t + V t 



■vw t 



(2.11) 





' Xt ' 












' Vt - 












ut+i 




w t +i 




Vt+i 




Vt+1 


where X t = 




, u t = 




, w t = 




, Y t = 




, v t = 






_ Xt + N _ 




_ ut+N-1 _ 




_ u>t+N—l _ 




_Vt + N _ 




_ Vt+N _ 



A = 



i 

A 



B 



o 

B 

AB B 

t N-l , 



'• 

AB B 



r o 
i 

A 



'• 

A I 



• A B 

and C — diag{C, • ■ • , C}. Let 

K t := (AP t \ t A T + Z w )C T (C(AP tlt A T + ^ W )C J + E^)" 1 

be the usual Kalman gain and define Ft := I — K t C, and <&t := T t A. Then, we can write 
the estimation error vector as 



E t := X t -X t = Tte t + T™W t - F v t V u 



(2.12) 



8 



P. HOKAYEM, E. CINQUEMANI, D. CHATTERJEE, F. RAMPONI, AND J. LYGEROS 



where e t = x t — it , J-f 



i 



t+N—l ' *t + iV-2 





r t 



7? 



*t + JV-2 ■ 1 1 *t + lTt 


A't 



*t + JV-2 ■ ' ' *t + l-ft 
* t + JV-l ■ ' • *t + l-fft 



r t + jV-2 









r t + JV-l 









Kt + N-1 



*t + iV--l Kt + N-2 

Finally, the innovations sequence can be written as 

Y t -Y t = CTte t + CT^W t + (I - CJ^)V t , (2.13) 

where Y t '■— CX t . Consequently, the innovations sequence over the prediction horizon is 
independent of the inputs vector U t . 

The cost function (2.3) at time t can be written as 

V t = E Vt [xjQX t + UjKU t ] , (2.14) 

where Q — diag{Qoj • • ■ ,Qn} and 1Z = diag{i?o, • • ■ ,Rn-i}- Also, the control policy 
(POL) at time t is given by 

U t = Vt + & t f(Y t - Y t ), (POL) 

where 



Vt- 



Vt 



MYt-Yt): 



Jjt+N-l 

and © t has the following lower block diagonal structure 

9 t .t 



fo(yt-yt) 



v>N-\{yt+N-x—yt+N-i). 



H+l,t 



't+l,t+l 



?t+N-l,t "t+N-l,t+l 





%+N—l,t+N—l_ 



(2.15) 



Since the innovation vector Y t — Y t in (2.13) is not a function of rj t and & t , the control 
inputs Ut in (POL) remain affine in the decision variables. This fact is important to show 
convexity of the optimization problem, as will be seen in the next section. Finally, the 
constraint (CI) can be rewritten as: 



U t G U := G 



«JVr, 



IIClloo < 



(CI) 



3. Main Results 



The optimization problem (OP1) we have stated so far, even if it is successively feasible, 
does not in general guarantee stability of the resulting receding horizon controller in 
Algorithm 1. Unlike deterministic stability arguments utilized in MPC (see, for example, 
[Maync et al., 2000]), we cannot assume the existence of a final region that is control 
positively invariant and which is crucial to establish stability. This is simply due to the 
fact that the process noise sequence is not assumed to have a compact domain of support. 
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However, guided by our earlier results in [Ramponi et al., 2009], we shall enforce an 
extra stability constraint which, if feasible, can render the state of the closed-loop system 
mean-square bounded. 

For t € N, the state estimate at time t + K given the state estimate at time t, the control 
inputs from time t through t + k — 1, and the corresponding process and measurement 
noise sequences can be written as 



x t+K[t+K = A K x tlt + <R K (A,B) 



ut 



where we define 



St := [A^KtCA^^Kt+.CA,--- ,K t+K _ x CA] 
+ [A K ~ 1 K t C, A K - 2 K t+1 C, ■■■ , K t+K _tC] 



Wi 



UH+K-i 



(3.1) 



[A"- l K tl A«-*Kt+i,-~ ,K t+K -i] 



v t +i 



Scholium 4. There exists an integer T and a positive constant £ = CC^v, S w , A, B, C) 
such that 

EyJlStllXC for allt^T. (3.2) 

A proof of Scholium 4 appears in Section 5. Using the constant £, we now require 
the following "drift condition" to be satisfied: for any given e > and for every t = 
0, N c , 2N C , • • • , there exists a Ut € U, such that the following condition is satisfied 



A^x[ 2) +m K (A 2 ,B 2 ) 



< \\x 



(2)1 



(C 



1 1 (2) 1 1 

whenever ^ ( + e. (C2) 



Note that 



= (Vt) 



(®t)v.Km i p(y — Y). (For notational convenience, we 



have retained ip(Y — Y) with the knowledge that the matrix (&t)i:Km causally selects the 
outputs as they become available, see (2.15).) The constraint (C2) pertains only to the 
second subsystem of the estimator, as the first subsystem (i^) is Schur stable (see (2.2) 
and (2.10)). We augment problem (OP1) with the stability constraint (C2) to obtain 

minjVt dynamics (2.1a)-(2.1b), policies (POL), and constraints (CI) & (C2)|. 

(OP2) 

Assumption 5. In addition to Assumption 1 and Assumption 2, we stipulate that: 

(i) The control authority U max ^ C/^ ax , where [/* lax := o- min ^f\ K (A 2 ,B 2 )y 1 {C + §)■ 

(ii) The control horizon N c ^ k, where k is the reachability index. 

(iii) (A, si, ) is stabilizable, and (A, C) is observable. § 



in 
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Theorem 6 (Main Result). Consider the system (2.1a)-(2.1b), and suppose that Assump- 
tion 5 holds. Then: 

(i) For every time t = 0, N c , 2N C , ■ ■ ■ , the optimization problem (OP2) is convex, feasible, 
and can be approximated to the following quadratic optimization problem: 

minimize 2xjA T QBr] t + 2tr(A T QB@ t Af x ) + 2tr(&J B T QVA^) 

(Vt, & t) \ / 

+ vjM!T] t + 2t 1 ]M 1 ® t A? + tr(@J Mx&tA^) (3.3) 
subject to the structure of & t in (2.15), 

\(r)t)i\ + \\(®t)i\\ 1 <Pm a x<U max Vi = l,---,Nm, (3.4) 

+m K (A 2 ,B 2 )(r 1t ) 1:Km + ^n 2 \\<R K (A 2 ,B 2 )(® t ) 1 



(3.6) 



< ||^ 2) || - (C+ |) whenever \\x ( t 2) || > ( + e (3.5) 

where Mi=TZ + B J QB, 

A? = Ey t MY Y)], Af x = Ey t MY - Y)xJ], 

Ar = Ey t [W^{Y Y) r ], Ar = E Vt MY Y)cp(Y Y) J ] . 

(ii) For every initial state and noise statistics [(xo, S xo ),T, W , E„) , successive application 
of the control laws given by the optimization problem in (i) for N c steps renders the 
closed-loop system mean-square bounded in the following sense: there exists a constant 
7 = 7(^0, K,io,Si„, t/D >0 such that 

sup Ey [\\x t \\ 2 ] < 7 . (3.7) 

ten 

In practice, it may be also of interest to impose further some soft constraints both on 
the state and the input vectors. For example, one may be interested in imposing quadratic 
or linear constraints on the state, both of which can be captured in the following 

Ey t [xJSX t +C T X t ]^a t , (C3) 

where S = S T ^ and at > 0. Moreover, expected energy expenditure constraints can 
be posed as follows 

Ey t [uJSU t ] < ft, (C4) 

where S = S T and f3 t > 0. In the absence of hard input constraints, such expectation- 
type constraints are commonly used in the stochastic MPC [Agarwal ct al., 2009; Primbs 
and Sung, 2009] and in stochastic optimization in the form of integrated chance con- 
straints [Klein Haneveld, 1983; Klein Haneveld and van der Vlerk, 2006]. This is partly 
because it is not possible, without posing further restrictions on the boundedness of the 
process noise wt , to ensure that hard constraints on the state are satisfied. For example, in 
the standard LQG setting nontrivial hard constraints on the system state would generally 
be violated with nonzero probability. Moreover, in contrast to chance constraints where a 
bound is imposed on the probability of constraint violation, expectation-type constraints 
tend to give rise to convex optimization problems under weak assumptions [Agarwal ct al., 
2009]. 

We can also augment problem (OP2) with the constraints (C3) and (C4) to obtain 

mmjVt dynamics (2. la)- (2. lb), policies (POL), constraints (CI), (C2), (C3) & (C4)|. 

(OP3) 

Notice that the constraints (C3) and (C4) are not necessarily feasible at time t for any 
choice of parameters a t and j3 t . As such, problem (OP3) may become infeasible over time if 
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we simply apply Algorithm 1. We therefore replace the optimization Step 7 in Algorithm 1 
with the following subroutine for some given a* t and /3 t * that make the constraints feasible, 
precision number 8 > and maximal number of iterations v: 



Subroutine 7 

7a: Set a = a* t , a = 0, /3 = /3 t * , and (3 = 

7b: Solve the optimization problem (OP3) using at = a and (3 t = (3 to obtain the 

sequence {u* t ,u* t+1 , ■ • • , Mt +A ,_ 1 } 
7c: Set v = 1 
7d: repeat 

7e: Set a t = (a + a) /2 and & = + /?) /2 

7f: Solve Solve the optimization problem (OP3) using the new at and (3t to obtain 

the sequence {u t ,u t +i, ■ ■ ■ ,u t +N-i} 
7g: if step 5f is feasible then 
7h: Set a — a t and (3 = f3 t 

7i: Set {u*,u* +1 , ■ ■ ■ ,u* +N _ 1 } = {u t ,u t +i, • ■ ■ ,"t+w-i} 

7j : else 

7k: Set a = a t and f3 — (3 t 

71: end if 
7m: set v = v + 1 

7n: until (|a — a| ^ <5 and |/3 — (3\ ^ <5) or v > v 



It may be argued that replacing Step 7 in Algorithm 1 with Subroutine 7 above increases 
the computational burden; however, the parameter v guarantees that this iterative process 
is halted after some prespecified number of steps in case the required precision 6 is not 
reached in the meantime. In some instances, we may be given some fixed parameters a 
and (3 with which the constraints (C3) and (C4) have to be satisfied, respectively. This 
requirement can be easily incorporated into Subroutine 7 by setting a = a and (3 = (3 in 
step 7a. 



Assumption 7. At each time step t — 0, N c , 2N C , 
tine 7 are chosen as 



a* t := 3tr(A T SAEy t [x t xj] +V T SVY, % 
j3* t := iVma max (5)L^ ax . 



, the constants a* t and ffi in Subrou- 
3Nma max (B T SB)U^ + C T Ax t + \\C T B\\U n 



Corollary 8. Consider the system (2. la) -(2. lb), and suppose that Assumption 5, and 7 
hold. Then: 

(i) For every time t = 0, N c , 2N C , ■ ■ ■ the optimization problem (OP3) solved in Subrou- 
tine 7 is convex, feasible, and equivalent to the following quadratically constrained 
quadratic optimization problem: 

minimize 2xJ A T QBr] f + 2tr(A T QB@ t Af x ) + 2trf ®JB T QT>A™ V 

+ qjMiVt + 2r]jMi&tAf + trfejMi&tAf^ 
subject to the structure of &t in (2.15), 



\(Vt 



(St 



i II 1 ^max : 

d\ K (A 2 ,B 2 )( Vt ) 1 



Vt = 1. 



, Nm, 



n 2 ||£H re (j4 2 , B 2 )(& t )i •.Km 1 1 oo rmax 



s$ \\x 



(2)1 



II ( 2) 1 1 

(C + §) whenever \\x t ^ £ + e, Vi 



,n 2 
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r]jB T SBT] t + 2r]jB T SB®tK + tr(®jB T SB® t A? v 
+ 2x T t A T SBr, t + 2tr(A T SB® t A? x + ®]B T SVA™^ 
+ C T B(r lt + ® t Af) + tr(A T SAEy t [x t xj]) + tr(V T SVX w ) 
+ C T Ax t < a t , (3.8) 

rjJSrjt + 2 V JS® t A? + tr (®JS® t Af* ) < fit- (3.9) 

(ii) For every initial state and noise statistics ((xo, £a; ), "£< Wl £„) , successive application 
of the control laws given by the optimization problem in (i) for N c steps renders the 
closed-loop system mean-square bounded in the following sense: there exists a constant 
7 = 7(3 7 o,K,^o,S Xo ,E to ,E l ,,f7^ ax ) > such that 

SU P E^cDl^l! 2 ] ^ 7- 
teN 



4. Discussion 

The optimization problem (OP2) being solved in Theorem 6 is a quadratic program 
(QP), whereas the optimization problem (OP3) being solved in Corollary (8) is a quadrat- 
ically constrained quadratic program (QCQP) in the optimization parameters (77,0). As 
such, both can be easily solved via software packages such as cvx [Grant and Boyd, 2000] 
or yalmip [Lofberg, 2004]. 

4.1. Other Constraints and Policies. It is not difficult to see that constraints on the 
variation of the inputs of the form 

llPttlL ^ At/ max , 



where P = 



1 -1 
/ -/ 



, can be incorporated into the optimization problems 



.()•■■ 1 — j_ 

(OP2) and (OP3). Moreover, we can easily solve the problem using quadratic policies of 
the form 



U t = r, + &<p(Y t - Y t ) + &<p(Y t - F t ), 



(POL') 



instead of (POL), where 



6 



'JV-1,0 vn-i,i 



ip(z) 



vo(yt - yt) T vo(yt - yt) 



■PN-l(Vt+N-l - St+N—l) PN-llVt+N — l — Vt + N-l)- 



and 6ij € ]R mxl . The underlying optimization problems (OP2) and (OP3) with the policy 
(POL') are still convex and both Theorem (6) and Corollary (8) still apply with minor 
changes. 

4.2. Off-Line Computation of the A Matrices. At any time t = 0, A^, 2A C , • • • , our 
ability to solve the optimization problems (OP2) and (OP3) in Theorem 6 and Corollary 
8, respectively, hinges upon our ability to compute the following matrices 



AT = EyMYt - Y t )(x t - x t ) T ], Ar = Ey t [W^(Y t - Y t ) T ], 
Af = Ey t MY - Y t )], AT = Ey t MY Y t )ip(Y t Y t ) T ] 

Af x = Af e + AfxJ. 



(4.1) 
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Recall that Y t — Y t is the innovations sequence that was given in (2.13), and x t is the 
optimal mean-square estimate of x t given the history y t . The matrices (4.1) may be 
computed by numerical integration with respect to the independent Gaussian measures 
of Wt, . . ■ Wt+N-i, of vt, ■ ■ . Ut+jV) and of Xt given y t . Due to the large dimensionality of 
the integration space, this approach may be impractical for online computations. One 
alternative approach relies on the observation that Af e , Af , A™ v , and Af v depend on x t 
via the difference x t — x t ■ Since x t — x t is conditionally zero- mean given y , we can write 
the dependency of (4.1) on the time- varying statistics of Xt given y l as follows: 

Af (*t, Pt\t) = K e (Pt\t) + Af(Pt\t)xJ, 

A? v (P t \t), and AT(Pt\t)- ' 

In principle one may compute off-line and store the matrices A|f e (P t | t ), Af(P t u), A t (P t ij), 
and Af v (P t \ t ), which depend on the covariance matrices P t | t , and just update online the 
value of Af x (x t ,P t \t) as the estimate x t becomes available. However, this poses serious 
requirements in terms of memory. A more appealing alternative is to exploit the conver- 
gence properties of P t \t- The following result can be inferred, for instance, from [Kamen 
and Su, 1999, Theorem 5.1]. 

Proposition 9. Under Assumptions 2 and 5- (Hi) 

• the (discrete-time) algebraic Riccati equation in P € K nx ™ 

P = A[P- PC T (CPC T + Y, V )- 1 CP]A T + £ w (4.3) 

has a unique solution P* ; and 

• the sequence Pt+l|i defined by (2.7) and (2.9) converges to P* as t tends to oo, 
for any initial condition fbl-i ? 0- 

The assumption that > can be relaxed to £„ ^ at the price of some additional 
technicality (more on this can be found in [Ferrante et al., 2002]). As a consequence of 
this result, under detectability and stabilizability assumptions, P t u converges to 

P° = P* - P*C T (CP*C T + S„)- 1 CP*, (4.4) 

which is the asymptotic error covariance matrix of the estimator xt- Thus, neglect- 
ing the initial transient, it makes sense to just compute off-line and store the matrices 
Af e {P°),Af(P°), A^(P°), and Af 1/5 (P°), and just update the matrix Af x for new values 
of the estimate i t . 

5. Proofs 

The proofs of the main results are presented as follows: We begin by showing the 
result in Scholium 4. Then, we state a fundamental result pertaining to mean-square 
boundedness for general random sequences in Proposition 11, which is utilized to show 
the mean-square boundedness conclusions of Theorem 6 and Corollary 8. We proceed 
to show the first assertion (i) of Theorem 6 in Lemma 12. The proof of the second 
assertion (ii) of Theorem 6 starts by showing Lemmas 13 and 14 to conclude mean-square 
boundedness of the orthogonal subsystem (A 2 ,B 2 ). We conclude the proof of Theorem 6 
by showing mean-square boundedness of the Schur stable subsystem (^4i, Pi). We end this 
section by proving the extra conclusions of Corollary 8, beyond those present in Theorem 
6. 

Let us now look at the estimation equation Xt(— x t \t) in (2.6) and combine it with (2.8) 
and the output equation (2.1b) to obtain 

x t+1 = Ax t + Bu t + K t (CA{x t -x f )+Cw t + v t+1 ) , (5.1) 
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where K t = {AP t]t A J + £„,)C T (C{AP t]t A J + T, W )C T + X^)" 1 is the Kalman gain. Our 
next Fact pertains to the boundedness of the error covariance matrices P t \t in (2.7). 

Fact 10. There exists f £ N and p,p m > such that tr(P t | t ) ^ p for all t ^ T' , and 
\\K t \\ < p m for all t > T . 

Fact 10 follows, for example, immediately from Lemma 15 in Appendix B and since by as- 
sumption £„ > 0, one possible bound on (|| if t ||) t>T , is given by ||Kt|| < ll APt i^ +£™j|||c | 

(||£„,|| + ||A|| 2 ||P t i t ||) ||c T | 

^ - — A [ s ) — • Note that there are many alternative bounds in the Literature, 

see, for example, [Anderson and Moore, 1979; Jazwinski, 1970]. Now, using the bounds 
in Fact 10, we can proceed to prove Scholium 4. 

Proof of Scholium 4- Recall the expression of S t in (3.1) and define the following quanti- 
ties: 

F* Kt := \_A K ~ l K Kt CA ■ ■ ■ AK K(t+1) _ 2 CA K K{t+1) . x CA] , 

F% := [A^K Kt C ■ ■ ■ AK K{t+1) _ 2 C ^ re(t+1) _ 1 C] , (5.2) 

FZ t := [A K ~ 1 K Kt ■ ■ ■ AK <t+l) _ 2 ^ (t+ i)-i] . 
Then, S Kt can be rewritten as 

2 Kt = F , K t e Kt:K ( t+ i)-i + F™t w Kt: K (t+i)-i + F£ t v Kt+ i :K ( t+1 y (5.3) 

But for t > \T /k\ , 1 we have that 

\\F^\\^k\\CA\\ sup pQ||, 

\\F%\\^ K \\C\\ sup \\K t \\, 

\\F' V J\^ K sup \\K t \\. 
e^n\T' /k] 

Using Fact 10 we know that sup^, re r T / / re -i \\K(}\ ^ ph- Thus, it suffices to take 

C - K*l 2 Ph (\\CA\\ y/p + \\C\\ 7tr(SJ + v/tr(^)) 
in order to upper-bound the expectation of S(t) in (5.3) after time T := k\T' /k] . □ 
The following result pertains to mean-square boundedness of a random sequence (£t)*eN : 

Proposition 11 ([Pcmantle and Rosenthal, 1999, Theorem 1]). Let (£t)teN be a sequence 
of nonnegative random variables on some probability space (Q,§,¥), and let (5*)teN be 
any filtration to which (t;t)teN *s adapted. Suppose that there exist constants b > 0, and 
J, M < oo, such that 

and for all t G N: 

E[£t+i — £,t\dt] ^ —b on the event {£t > J}, and (5-4) 
E[l&+i-&| 4 |£o, •■•,&] <M. (5.5) 

Then there exists a constant 7 = 7(6, J, M) > such that supEf^l ^ 7. 

ten 

Lemma 12. Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. 
Then the first assertion (i) of Theorem 6 holds. 



Recall that for any positive real number s, \s\ denotes the smallest integer that upper-bounds s. 
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Proof. It is clear that XjQX t + UjlZU t is convex quadratic in X t and U t , and both X t 
and U t are afHne functions of the design parameters (rj t , t ) for every realization of the 
noise (wt)ten- Since taking expectation of a convex function retains convexity [Boyd and 
Vandenberghe, 2004], we conclude that K Xt [Xj QX t + UjlZUt] is convex in (r) t , t ). Also, 
note that the constraint sets described by (3.4) and (3.5) are convex in (ij t ,@t). This 
settles the claim concerning convexity of the optimization program in Theorem 6-(i). 
Concerning the objective function (3.3), we have that 

Ey t [xjQX t + UjTZU t ] 
= Ey t [{Ax t + BU t + VW t ) T Q(Ax t + BU t + VW t ) + UjKU t ] 
= E Vt [\\Ax t + B( Vt + &MY t - Y t )) + VW t \\ 2 Q + \\ Vt + ®MY t - Y t )\\^ 
= Ey t \xjA J QAx t + 2xjA T QBr] t + 2xJ A T QB& t ^(Y t - Y t ) + 2xjA T QVW t 
+ rjJ(B T QB + K) Vt + 2 V J(B T QB + K)® t <p(Y t - Y t ) + 2rjjB T QVW t 
+ ip{Y t - Y t ) T &J(B T QB + 1l)@MY t - Y t ) + 2<p(Y t - Y t ) T &J B T QVW t 
+ WjV T QVW t 

Since Ety ttXt \ \Wt\ = and in view of the definitions of the various matrices in (3.6) we 
have 

V t = 2x]A T QBr) t + 2tr(A T QB® t A? x ) + 2tr(@jB T QVA 

+ vjM^t + 2T)jMi® t Af + tr(ejMiB t At v 
+ terms that do not depend on (rj t , © t ). 

This tallies the objective function in (3.3) with the objective Vt in (2.14). 

Concerning the constraints, we have shown in [Chatterjee et al., 2009; Hokayem et al., 
2009] that combining the constraint H^tH^ < U max and the class of policies (POL) is 
equivalent to the constraints |(?7t)i| + ||(0*)i|li Vmax ^ U max for all i = 1, . . . ,Nm, which 
accounts for the constraint (3.4). Substituting (POL) into the stability constraint (C2), 
we obtain 



A^xf } + m K {A 2l B 2 )( Vt )i: Km + d\ K (A 2 , B 2 ){® t ) 1:Km <p(W) 

< A%x<t 2) +X K (A2,B 2 )(rit)i.. Km \\ +\\<R K (A 2 ,B 2 )(®t)i.. Km <p(W)\\ 
A K 2 xf ] +m K (A 2 ,B 2 ){ Vt ) 1 .. Km +V^\\M K (A 2 ,B 2 )(®t) 1:Km <p{W)\\ 



A 2 x\ 



dK K {A2,B 2 ){r, t ), 



^||5H K (A 2 ,B 2 )(0 t )i 



Accordingly, if condition constraint (3.5) is satisfied, then the stability constraint (C2) is 
satisfied as well. 

It remains to show that the constraints are simultaneously feasible. Inspired by the 
work in [Ramponi et al., 2009], we consider the candidate controller 



^ K (^2,S 2 ) T sat r (A^ 2) ), 



Ut:t+K-1 '■— 

JH+K-1. 

i.e., with f = 0, where r := £ + e/2. First, we have that 

Wut-.t+K-iWoo < \\ut:t+ K -i\\ 2 < CT mi „(^ K (^2,S 2 ))- 1 (C + e/2) 



U* 
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and the constraint (CI) is satisfied. Concerning constraint (C2), we have that 



^ 2) +JK«(A 2) B 2 )u t:t+B _ 1 



;,(2) 



;.(2) 



-(C + e/2), whenever ||^ 2) II ^ C + £, (5.6) 



where the first equality follows from the orthogonality of A 2 (see [Ramponi et al., 2009]), 
and the constraint (3.5) is also satisfied. The optimization program (3.3) subject to (3.4)- 
(3.5) is therefore a quadratic program that is equivalent to (OP2). □ 

Lemma 13. Consider the system (2.1a)-(2.1b) ; and suppose that Assumption 5 holds. 
Then there exist constants b, J > such that 2 



E 



|a( 2 ) 

r«(t+i)l 



i(2)l 



—b on the set {\\x^ t || J} f or a ^ t ^ \T'/k~\. 



Proof. Let IT^ 2 ^ be a projection operator that picks the last n 2 components of any vector 
in R n , and consider the subsampled process (x\J jt&n given by 



€(t+i) = A %*Kt + n K (A 2 ,B 2 )u Kt:K{t+1) _ 1 + n& (z Kt 
By utilizing the triangle inequality we get 



(5.7) 



(2) 

«(t+l) 



a(2) 



A K r {2) 



9^k(^2, 5 2 )li Kt:K ( t+ i)_i 



A(2) 



iK 2 >(s Kt ) 



We know from Scholium 4 that there exists a uniform (with respect to time t) upper bound 
£ for the last term on the right-hand side of the preceding inequality for t ^ k\T'/k\ . We 
rewrite the inequality as 



(2) 

k(*+1) 



;.(2) 



A!£x% + m K (A 2 , B 2 )u Kt ., K{t+1) _ x 
whenever ||ij 2 ' ) || ^ Q + e, 



2' 



l(2) 



+ c 



where the last inequality follows from (5.6). Setting b = e/2 and J = C + £ completes the 
proof. □ 

Lemma 14. Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. 
Then there exists a constant M > such that 



E 



~(2) 

C k(*+1) 



;.(2) 



, i=\T'/K\,...,t 



^ M 



for all t > \T'/k] . 



Proof. Fix t ^ [T'/k], and consider again the subsampled process in (5.7). We have, by 
orthogonality of A 2 , that 



A 2 x K t 



E 

= E 
E 
= E 



(2) 

K(t+1) 



A(2) 



x(2) 



and, by the triangle inequality, that 
i = \T'/ K ],...,t 



(2) 

«(t+l 



4«a(2) 

/1 2 X Kt 



l(2) 



A£x)J + Vi K (A 2l B 2 )u Kt . Mt +i)-i + S Ki - A£x 



(2) 



;.(2) 



\T'/K],...,t 



(\\?R-k(A 2 , B 2 )u Kt:K ( t+ i 



)-i| 



, *= [Ty*!,...,* 



For 26I the notation [2] stands for the smallest integer greater or equal to 2. 
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Recall that for any two positive numbers a and b, (a + b) 2 ^ 2a 2 + 2b 2 . Using this latter 
fact, we arrive at the following upper bound 

4 



E 



A(2) 

X K(t+1) 



;.( 2 ) 



;,(2) 



, i=\T>/K],. 



€ 8E 



\VH K (A 2 , B 2 )u K f. K (t+i)-i\ 



,(2) 



i = [T'/kI,. ..,t. 

i=\r/ K ],...,t] 



-(2) 



By design 1 1 1 1 ^ ^ U msx and ^ K t is Gaussian and independent of j 
and has its fourth moment bounded. Therefore, we can easily infer that there exists an 
M > such that 



E 



(||OT«(A 2 ,SaKi:«(t+i)-i|| + ||S«t| 
for all t ^ \T'/k\, 

We are finally ready to prove Theorem 6. 



,-.(2) 



[T7«l,. 



s$ M 



□ 



Proof of Theorem 6. Claim (i) of Theorem 6 was proved in Lemma 12. It remains to show 
claim (ii). We start by asserting the following inequality 



Ey 



[11**1 



sC 2E 



y. 



[\\xt 



We know from Fact 10 that E 



y, 



Xt 

2 



•2E-, 



[11**1 



It'/.] 



\%t — *t|| ] ^ P for all i ^ re[T'//c]. As such, 
if we are able to show that the state of the estimator is mean-square bounded, we can 
immediately infer a mean-square bound on the state of the plant. 

We first start by splitting the squared norm of the estimator state as ||it|| 2 = H^t^H + 
||x( 2 ' ) || 2 , where x^ and x^ are states corresponding to the Schur and orthogonal parts 
of the system, respectively It then follows that 



Ei 



Letting £ t 



[11**1 

«( 2 )ll 



E 



t e N. 



(5.8) 



for t € N, we see from Lemma 13 and Lemma 14 the conditions 



(5.4) and (5.5) of Proposition 11 are verified for the sequence (Ck*)*>[T'/k1 • Thus, by 
Proposition 11, there exists a constant 7^ = ^ 2 \b, J, M) > such that 



Ei 



[p* II 2 ] < 7 (2) for a11 1 > r T '/ K i- 



Hence, the orthogonal subsystem is mean-square bounded. 

Since the matrix A\ is Schur stable, we know [Bernstein, 2009, Proposition 11.10.5] 
that there exists a positive definite matrix M G IR" 1 *" 1 that satisfies AjMAi — M = —I. 
It easily follows that there exists p E ]0, 1[ such that AjMAi — M —pM; in fact, p 
can be chosen from the interval ] 0, minjl, A m i n (/)/A max (M) } [. Therefore, we see that 
for any t ^ T', 



E, 



a(1) 
x t+i 



,1/ 



2E yt [{&W) r AjMB lV P] +Ey t [n«(S t )], 



M 



(1)1 



where Ipi) is the projection onto the first n\ coordinates and St was defined in (3.1). 
Young's inequality shows that 2Ey t [(x { t 1} ) T A] M B^] ^ eEy t [\\ AxX^ f M ] +\Ey t [\\B X 
for e > 0. Choosing e = 3p/(2(l — p)) and utilizing Fact 10 and the upper bound on the 
inputs U max , it follows that for all t ^ T' , 



,(1)1 



E, 



-(1-f) 



*(i) 



2(1 ~g) 
3p 



A max (M)||i? 1 || 2 [/ 2 ax n 1 +C=: 
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Therefore, for t^T', we have that 

2 



E 



y, 



(1) 

k\T' /k]+2 



M 



E 



^(l-f) 2 

Iterating the last inequality, it follows that 

2 



2 



j KrT'/ K i+2 
(i-f) 



2 

M 

c (1) 



+ c 



+ (l + (l-f))c. 



E 



A/ 



<(!-§)* 



A/ 



or 



E, 



A/ 



a 



M 



t > T' . 



We can conclude that there exists 7*- 1 ) = 7 ( - 1 - > (io) > such that 



(5.9) 



We can therefore conclude that 



+ 7 l 



for all * ^ k|T7«1- 



Since the sequence (xt)t&i in (2.1a) is generated through the addition of independent 
mean-square bounded random variables and a bounded control input, and since k\T' / k] < 
oo, it follows that there exists 7 > such that 

E :vo[IM 2 ] < 7 for a11 t G N , 
establishing the second claim (ii) of Theorem 6. □ 

Proof of Corollary 8. The proof of Corollary 8 follows exactly the same reasoning as in 
the proof of Theorem 6, up to the constraints in (3.8) and (3.9). Also, rewriting the 
constraints (C3) and (C4) as (3.8) and (3.9), respectively, can be done similarly to the 
way we rewrote the cost in Theorem (G). It remains to show the upper bounds a* and j3* 
in Assumption (7). 

The constraint (C3) can be upper-bounded as follows: 

Ey t [XJSX t + C T X t ] 



\Ax t + BU t + VWt\\ 2 s + £ T (Ar t + BU t + VW t . 



\Ax 



tils 



WBUi 



t\\ s 



\vw t 



tils 



C T Ax t + \C T BU t \ 



^ 3tr(^ T 5^E^ t [x t xj] + V T SVZ W ) + C T Ax t 
Wma max {B T SB)U, 



2 

max 



lz* T Bll II 

I 111 max ! 



where the first inequality follows from the fact that (a + b + c) 2 ^ 3(a 2 + b 2 + c 2 ), for any 
a, b, c > 0, and the noise being zero-mean, the second inequality follows from applying 
norm bounds between the 2 and oo-norms and Holder's inequality. As for the constraint 
(C4), it can be upper-bounded as follows: 



Ei 



UjSUt 



< <r m ax(S) \\Ut\f < Nma max (S) ||tf t ||^ < Nma max (S)U- 



2 

max • 



This completes the proof. 



□ 
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6. Examples 

Consider the system (2.1a)-(2.1b) with the following matrices: 





"0.5 





" 




"1" 


A = 








-1 


■B = 










1 







1 



The simulation data was chosen to be Xq ~ jV(0, J), w t ~ Af (0,101), v t — 7V(0, 10/), 
Q = I, R = 1, N = 5, N c = K = 2, and y> the usual piecewise linear saturation function 
with <ys max = 1. For this example the theoretical bound on the input is ET^ax = 168.6783 
for a choice of e = 10. 

We simulated the system for the discrete-time interval [0, 200] using Algorithm 1, with- 
out the constraints (C3) and (C4). The coding was done using MATLAB and the opti- 
mization problem was solved using yalmip [Lofberg, 2004] and sdpt3 [Toh et al., 1999]. 
The computation of the matrices A^ e (P°), A^(P°), A^(P°), and A^(P°) was done 
off-line using the steady state error covariance matrix P°, as discussed in the previous 
section, via classical Monte Carlo integration [Robert and Casella, 2004, Section 3.2] using 
10 5 samples. 

The norms of the state trajectories for 200 different sample paths of the process and 
measurement noises are plotted in Figure 2 starting from x = [97.38 100.19 99.78] . 
The average state norm as well as the standard deviation of the state norm are depicted 
in Figure 3, where it is clear that the proposed controller renders the system mean-square 
bounded. The average total cost normalized by time for this simulation is plotted in 
Figure 4. 




Figure 2. State norm for 200 realizations of the process and measure- 
ment noise sequences 



7. Conclusions 

We presented a method for stochastic receding horizon control of discrete-time linear 
systems with process and measurement noise and bounded input policies. We showed that 
the optimization problem solved periodically is successively feasible and convex. Moreover, 
we illustrated how a certain stability condition can be utilized to show that the application 
of the receding horizon controller renders the state of the system mean-square bounded. 
We discussed how certain matrices in the cost function can be computed off-line and 
provided an example that illustrates our approach. 
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AVERAGE 

- - - STANDARD DEVIATION 




20 40 60 80 100 120 140 160 180 200 
TIME 



Figure 3. Average and standard deviation of the state norm for 200 
realizations of the process and measurement noise sequences 




20 40 60 80 100 120 140 160 180 200 
TIME 



Figure 4. Total average cost 
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Appendix A. Proof of Proposition 3 

For t = 0, xq A/"(xo|_i, -Po|-i)j w hh Pol— l > ®i by assumption. Assume now that, for 
a given t ^ 0, f(x t \y t _i) is normal with mean x t \ t _i and covariance matrix P t \t-i > 0. 
By Assumption l-(ii) and dynamics in (2.1b), f(yt\xt, D^t-i) is also normal with mean Cx t 
and covariance matrix T, v . Hence, applying Bayes rule we may write 

,, n f(yt\xt,yt-i)f(xt\yt-i) 

f{x t \y t ) = jj—rr: 7 , 

f{vt\yt-i) 

where we have = / f(yt\xt, yt) f (x t \yt-i)dx t by the Chapman-Kolmogorov 

equation. It follows that 

f(x t \y t ) « f(yt\x u y t -i)f(xt\yt-i) 



oc exp 



f-^ {yt - Cx t ) T E t , 1 (y t - Cx t ) + (x t - i t | t _ 1 ) T P 4|t 1 _ 1 (x t - x t \ t -i) 
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where the proportionality constants do not depend on x t . Let us now focus on the term 
within square brackets and write x, y, x and P in place of x t , yt, it|t-i an d Pt\t-i f° r 
shortness. Expanding the products and collecting the linear and quadratic terms in x one 
gets 

x T (C T E; 1 C + P~ l )x - 2x T (C T X~ 1 y + P^x) + (y T ^- 1 y + x T p- l x). 

Since P > and S„ > by assumption, it follows that C T 'E~ 1 C + P -1 > 0. If we let 
P* = (C T E~ 1 C + P -1 ) -1 , the expression above can be rewritten as 

(F- 1 x) T P >t (p- 1 x) - 2(p- 1 :r) T P t (C T 5]; 1 2 / + P~ x x) + (y^y + PP^x). 

Completing the square, the latter expression becomes 

[P^x - (C 7 ^ + P-^)] T P [P^x - {C 1 ^-^ + P- I x)] + c, 

where c depends on x and y but not on x. Factoring P~ out of the two terms P~ x — 
(C T T,~ 1 y + P _1 x) and simplifying yields 

[x - P.iC^y + P^x^p-^x - P^^y + P^x)} + c. 

Hence, 

f(x t \y t ) oc exp ^-^(^t - i*) T P ! T 1 (2 ; t ~ , 

with = P* (C T T,~ 1 y+P~ 1 x), where the missing normalization constant does not depend 
on x t . Hence, f(x t \yt) is normal with mean i t | t = and covariance matrix P t | t = P*. 
Using the matrix inversion lemma, P* = P - PC T (CPC T + E B ) _1 CP, which is (2.7). To 
obtain (2.6), replace y by (y — Cx) + Cx in the expression of cc* to get 

4, = p, [C T iV(y - G4) + C^E" 1 ^ + P^x] 

= P*C J ^- l (y - Cx) + P*{C T Y,- l C + P-^x 

= P*C~ r T,- 1 {y-Cx)+x. 

Finally, using the identity / = (CPC T + S„)(CPC T + S^) -1 (where the inverse exists 
because P ^ and > 0) and the definition of P*, 

P.C 7 ^- 1 = P*C T ^- 1 (CPC T + £,)(CPC* T + s.r 1 

= P^{C T Y,- 1 CPC T + C T ){CPC T + EJ- 1 

= P^C^E" 1 *? + P- 1 )PC T {CPC T + E,,)" 1 

= pc t (cpc t + Y, v y 1 , 

which leads to (2.6). 

To conclude the proof, we need to compute the density f(xt+i |3^t) and prove (2.8)-(2.9). 
Using the Chapman-Kolmogorov equation, we can compute the density as follows 

f{xt+x\y t ) = J f(x t+1 \x t ,y t )f(x t \y t )dx t . (a.i) 

However, the explicit computation of the integral in (A.l) is a very tedious exercise. We 
shall instead rely on characteristic functions. Recall that the characteristic function of 
an n-dimensional Gaussian random vector £ with mean /x and covariance E ^ is given 
by H>( z ) = E[exp(iz T £)] = exp(izV- ±z T Ez), where i := and z € R". The 

characteristic function of Xt+\ given y t is then 

=E[exp{iz T x t+1 )\y t ] 

= E [E[ exp (iz T {Ax t + Bu t + w t )) \ x t ,y t ] \y t ] 

= E[exp(iz T Ax t + iz T Bg t (y t )) x E [ cxp{iz T w t ) \x u y t ] \y t ] , 



24 



P. HOKAYEM, E. CINQUEMANI, D. CHATTERJEE, F. RAMPONI, AND J. LYGEROS 



where we have used system dynamics in (2.1a) and the general definition of the feedback 
policies (2.4). Since w t is independent of x t and y t and it is Gaussian with mean and 
covariance T, w , ¥.[exp(iz T uit)\xt,yt] — exp (— ^z T Y, w z}. Therefore the last expression in 
the chain becomes 



E [exp (iz T Ax t ) | y t ] exp (iz T Bg t {y t )) exp f --z T E w z 

It was proved above that, conditionally on y t , xt is Gaussian with mean x t \t and covariance 
matrix P t u. Hence 

E[exp(iz T Ax t )\y t ] = E [exp (i(A r z) r x t ) \y t ] 

= exp (i(A r z) r x t[t - ±(A T z) T P t{t (A T z) 

and consequently 

*(z) = exp Uz T Ax t \ t - ^z T AP t]t A r z\ x exp (iz T Bg t (y t )) exp f-^z^ 
= exp(iz T (Ax tlt + Bg t (y t )) - ^z T (AP t]t A T + E w 



which is the characteristic function of a Gaussian random vector with mean x t+ i\ t 
Ax t \ t + Bg t (y t ) and covariance matrix P t +i\t = AP t \ t A T + T, w > 0. 



Appendix B. 

1 /2 

Lemma 15. Consider the system (2.1a)-(2.1b), and suppose that (A, S«, ) be stabilizable 
and (A, C) be observable. In addition, assume that Po|o ^ 0- Then there exist constants 
p > and an integer T large enough such that 



%, [Ha* " &tf] = tr ( p t\t) < P for all t > T. 



(B.l) 



Proof. First, observe that 

Ki — 1 

A^ w (A l ) T 



i=0 



E!/ 2 AE!/ 2 ■ ■ ■ A*- 1 YL' i 



E!/ 2 ae!/ 2 • • • a* 1 -^ 1 / 2 



1/2 

and since (A, E w ) is controllable by Assumption 5- (hi), we see that there exists k\ S N+ 



such that for all k K\ the rank of 



a/2 , v i/2 



AEi; •••A 



Kl _l v l/2 



= n; indeed, ki is the 



reachability index of (A, Ei/ 2 ). Thus, J2i=o 1 ^^u>(A l ) T is positive definite, and therefore, 
there exists some 5[ , 6' 2 > such that 



Kl- 1 

j;/ sc J] A l Y, w (A l ) T ^ 5' 2 I. 

i=0 



(B.2) 



Second, observe that 



«2 — 1 

J] (^) T C ,T E~ 1 Cyl l = 

i=0 



c 

CA 



CA K 



C 

CA 



Here (g> denotes the standard Kronecker product. 
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Since (A, C) is observable by assumption, there exists k 2 € N + such that the rank of 



the matrix 



T 



C T A T C T ■ ■ ■ (A K2 ~ 1 ) T C T is n. The matrix I K2 ® E" 1 is clearly positive 
definite by Assumption 5'(iii), and therefore, we see that there exists 6", 5% > such that 

K2 — 1 

8'(I ^ {A i ) T C r T,- l CA i Sgl. (B.3) 

Third, the conditions of Lemma 7.1 in [Jazwinski, 1970, pp. 234] are satisfied, as (B.2) 
and (B.3) hold with Si = min{<5^,(5"} and 82 = max.{6' 2 , 6$}, and the bound P t u ^ p'l 
for some p' > is established for all t ^ T := max{Ki, k 2 }- The assertion now follows 
immediately from: 

% t [\\x t ~ xtf] = tr(P t[t ) nA max (P t , t ) < np' =: p. □ 



